Abstract. We use the nowadays classical theory of generalized moment
problems by Krein-Nudelman [9] to define a special class of stochastic
Gaussian processes. The class contains, of course, stationary Gaussian
processes. We obtain a spectral representation for the processes from
this class and we solve the corresponding prediction problem. The
orthogonal rational functions on the unit circle lead to a class of
Gaussian processes providing an example for the above construction.



Introduction 



It is a classical fact that a (discrete-time) stationary Gaussian process
admits a spectral representation which allows one to transfer the study of
various characteristics connected to the process to the study of the shift
operator on $L^2(d\mu)$, $\mu$ being a positive Borel measure on the unit
circle. The measure $\mu$ is called the spectral measure of the process. The
considerations pertaining to the geometry of the space $L^2(d\mu)$ provide
striking applications of the theory of orthogonal polynomials on the unit
circle, see Simon [10] for an exhaustive treatment of the subject and its
applications.

The purpose of the present paper is two-fold. Being interested in
orthogonal rational functions on the unit circle (ORFs, for the sake of
brevity), the authors wanted to understand what role these systems of
functions play with respect to stochastic Gaussian processes. That is, is
there a spectral representation of certain stochastic processes via ORFs?
What kind of stochastic processes arise in this way? How does one formulate
(and solve) prediction problems for these processes? A brief discussion of
some aspects of this problem can be found in Dewilde-Dym [6]; the authors
say that, for a stationary Gaussian process, one can apply the spectral
representation theorem (see Theorem 0.3). The past/future is given by
subspaces looking like $\mathcal{X}_n^- = \mathrm{lin}\{t^k : k \le n\}$ and
$\mathcal{X}_n^+ = \mathrm{lin}\{t^k : k \ge n+1\}$. Roughly speaking, they
suggest, however, considering "deformed" spaces $\mathcal{L}_n$ (see
Section 0.1) and, following the parallels with classical prediction
problems, studying the projections to $\mathcal{L}_n$ instead of
$\mathcal{X}_n^-$. The question is very natural from the analytical point of
view and leads readily to the introduction and the study of ORFs.
Nevertheless, the probabilistic meaning of $\mathcal{L}_n$ seems to be
rather "opaque". One of the questions we wanted to look at was to find a
probabilistic interpretation for the described procedure.

Second, the book by Bultheel et al. [4] treats, besides other interesting 
topics, the moment problems coming from ORFs (see Chapter 10). This 
made us think of a monograph by Krein-Nudelman [9], entirely devoted to 
generalized moment problems, descriptions of their solutions, the study of 
extremal solutions and different variations on the topic. The strong appeal 
was to compare the results of both books and to attempt to "mix up" their 
approaches. 

Following Krein-Nudelman, we introduce a rather general class of moment
problems and obtain the necessary and sufficient condition for their
solvability (Section 1). The questions on determinacy/indeterminacy of the
problem, the description of solutions and the study of extremal solutions
are left aside. Then we define a class of varying Gaussian processes and
we obtain their spectral representation. The term "varying" is intended
to be synonymous with "generalized stationary", but seems to better reflect
the situation. It means that the covariance matrices of the process change
according to a concrete rule determined by a sequence of external (and
fixed) parameters. Corresponding prediction problems are formulated in a
natural way; their solutions involve generalized orthogonal polynomials (or
orthogonal Laurent polynomials). We specify the construction for the case
of ORFs in Section 3. This supplements [4, Ch. 10] (where the solvability
conditions are not discussed) with a solvability criterion and provides a
unified and natural approach to the problems of this type.



0.1. Generalized moment problems. Chapter 10 of the book by Bultheel
et al. [4] is devoted to the study of a special moment problem generated by
the ORFs. The considered problem is a particular case of the concept of
a generalized moment problem, suggested by Krein-Nudelman in the monograph
[9]. We adopt the latter point of view for our presentation since we believe
that this "generalized" approach allows one to gain in clarity as well as in
generality.

The content of this subsection is borrowed from the monograph by
Krein-Nudelman [9, Ch. 1-3]. The definitions below are given for the
reader's convenience; we refer to the book [9] for an extensive discussion
and deep results on the subject. See also Grenander-Szegő [7] for a
presentation of the classical version of the topic.

All objects appearing in the first part of this subsection are real-valued.
Let $\mathcal{M}(\mathbb{T})$ be the set of positive finite Borel measures on
$\mathbb{T}$. Let $\mathfrak{U} = \{u_k\}_{k=0,\dots,\infty}$ be a system of
continuous linearly independent real-valued functions defined on
$\mathbb{T} = \{|z| = 1\}$ (or an interval of $\mathbb{R}$). Put
$\mathcal{L}_n = \mathcal{L}_{n,\mathfrak{U}}$ to be the linear span of
$u_0, \dots, u_n$.

Let the system $\mathfrak{U}$ satisfy the following property: there are
coefficients $\{a_k\}_k$, $a_k \in \mathbb{R}$, such that

(0.1)    $\sum_{k=0}^{n} a_k u_k > 0$

on $\mathbb{T}$. Let $\mathfrak{C} : \mathcal{L}_n \to \mathbb{R}$ be the
linear functional

         $\mathfrak{C}\Big(\sum_{k=0}^{n} a_k u_k\Big) = \sum_{k=0}^{n} a_k c_k$,

where $c_k = \mathfrak{C}(u_k) \in \mathbb{R}$. The functional is called
positive, if $\mathfrak{C}(f) \ge 0$ provided $f \in \mathcal{L}_n$,
$f \ge 0$ on $\mathbb{T}$. The sequence $\{c_k\}_{k=0,\dots,\infty}$ is called
positive (w.r.to $\mathfrak{U}$), if it defines a positive functional.

The following theorem is of fundamental importance. 

Theorem 0.1 ([9], Ch. 3, Theorem 1.1). Let the system $\mathfrak{U}$ satisfy
(0.1). The following assertions are equivalent:

- the functional $\mathfrak{C}$ is positive (i.e., the sequence $\{c_k\}_k$
  is positive),

- $\{c_k\}$ is a sequence of generalized moments, that is, there is a
  $\sigma \in \mathcal{M}(\mathbb{T})$ such that

         $c_k = \int_{\mathbb{T}} u_k \, d\sigma$.

The same result verbatim holds for infinite systems
$\{u_k\}_{k=0,\dots,\infty}$ and $\{c_k\}_{k=0,\dots,\infty}$.

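We now switch to complex-valued objects. Let
$\mathfrak{W} = \{w_k\}_{k=0,\dots,\infty}$ be a system of continuous
complex-valued functions on $\mathbb{T}$. We require that
$\{w_k\}_k \cup \{\bar w_k\}_k$ be linearly independent (w.r.to $\mathbb{C}$).
Setting $\mathfrak{U} = \{u_k\}_{k=0,\dots,\infty}$,
$\mathfrak{V} = \{v_k\}_{k=0,\dots,\infty}$ with $u_k = \mathrm{Re}\, w_k$,
$v_k = \mathrm{Im}\, w_k$, we rewrite the real-valued functional
$\mathfrak{C}$ defined originally on
$\mathcal{L}_{n,\mathfrak{U} \cup \mathfrak{V}}$ as a complex-valued linear
functional (denoted by the same letter) on $\mathcal{L}_{n,\mathfrak{W}}$
defined as $\mathfrak{C}(u_k + i v_k) = \mathfrak{C}(u_k) + i\,\mathfrak{C}(v_k)$.
It is convenient to put $w_{-k} = \bar w_k$ and one sees
$\mathfrak{C}(w_{-k}) = \overline{\mathfrak{C}(w_k)}$.

From now on, $\mathcal{L}_{n,\mathfrak{W}}$ is abbreviated as $\mathcal{L}_n$.
The functions from $\mathcal{L}_n$ might be thought of as "analytic
polynomials" of degree $n$; the space $\mathcal{L}_n + \bar{\mathcal{L}}_n$
contains "trigonometric polynomials" of degree $n$.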

Theorem 0.2 ([9], Ch. 3, Theorem 1.3). Let the system $\mathfrak{W}$ satisfy

(0.2)    $\sum_{k=0}^{n} (a_k w_k + \bar a_k \bar w_k) > 0$

for some coefficients $\{a_k\}$, $a_k \in \mathbb{C}$. Then the following
assertions for $\{c_k\}_k$, $c_k \in \mathbb{C}$, are equivalent:

- The relation

(0.3)    $\sum_{k=0}^{n} (a_k w_k + \bar a_k \bar w_k) \ge 0$

  implies that

         $\sum_{k=0}^{n} (a_k c_k + \bar a_k \bar c_k) \ge 0$.

- The coefficients $\{c_k\}_k$ are generalized moments (w.r.to
  $\mathfrak{W}$), that is: there exists a measure
  $\sigma \in \mathcal{M}(\mathbb{T})$ with the property

         $c_k = \int_{\mathbb{T}} w_k \, d\sigma$.
The proof is by application of Theorem 0.1 to the real part of
$\sum_{k=0}^{n} a_k w_k$.

0.2. Spectral representation for stationary Gaussian processes. We
recall some highly standard facts from the theory of stochastic processes;
much more information on the topic is, for instance, in the monographs by
Breiman [2] or Lamperti [8].

Let $\{\Omega, \mathcal{F}, P\}$ be a probability space and
$\{X_n\}_{n \in \mathbb{Z}}$, $X_n : \Omega \to \mathbb{C}$, be a family of
random variables. The system $\mathbf{X} = \{X_n\}_n$ is called a stochastic
process. We assume that $E(X_n) = 0$, $E(|X_n|^2) < \infty$.

Suppose now that the $X_n$ have joint normal distribution; then $\{X_n\}$ is
called a (stochastic) Gaussian process. For $\{X_{n_1}, \dots, X_{n_m}\}$
define the covariance matrix by

         $C_{n_1,\dots,n_m} = C(X_{n_1},\dots,X_{n_m}) = [c_{jk}]$,  $c_{jk} = E(\bar X_{n_j} X_{n_k})$,

where $j,k = 1, \dots, m$. The process is stationary if the covariance
matrix is invariant w.r.to the shift, i.e.,

         $C_{n_1,\dots,n_m} = C_{n_1+1,\dots,n_m+1}$.

A "stationary Gaussian process" will be often abbreviated as SGP or
SG-process.

It follows from Herglotz' theorem (see [2, Theorem 11.19]) that there is
a unique $\mu \in \mathcal{M}(\mathbb{T})$, called the spectral measure of
the process, with the property
$E(\bar X_j X_k) = \int_{\mathbb{T}} t^{k-j} \, d\mu(t)$.
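As an illustration of the Herglotz representation, the following sketch builds the covariance matrix of a stationary process from a purely atomic spectral measure and checks that it is Toeplitz, Hermitian and non-negative. The atoms, the weights, the test vector `b` and the convention $c_{jk} = \int t^{k-j}\,d\mu$ are assumptions made only for this example.

```python
import cmath

def toeplitz_covariance(atoms, weights, n):
    """Return [c_jk], c_jk = sum_i weights[i] * atoms[i]**(k - j), for j, k = 0..n."""
    def moment(p):
        return sum(m * t ** p for t, m in zip(atoms, weights))
    return [[moment(k - j) for k in range(n + 1)] for j in range(n + 1)]

def quadratic_form(C, b):
    """Return sum_{j,k} conj(b_j) * c_jk * b_k."""
    size = len(b)
    return sum(b[j].conjugate() * C[j][k] * b[k]
               for j in range(size) for k in range(size))

# Hypothetical spectral measure: three point masses on the unit circle.
atoms = [cmath.exp(1j * angle) for angle in (0.3, 1.7, 4.0)]
weights = [0.5, 1.0, 0.25]

C = toeplitz_covariance(atoms, weights, 3)

# An arbitrary test vector; the form equals sum_i weights[i] * |p(t_i)|^2,
# with p(t) = sum_k b_k t^k, hence it cannot be negative.
b = [1 + 0j, -2j, 0.5 + 0j, 1 - 1j]
q = quadratic_form(C, b)
```

The entries depend only on the difference of the indices, which is exactly the shift invariance of the covariance matrices.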

The following theorem uses the concept of stochastic integral (see [2,
Ch. 11, Sect. 6] once again) and it gives the spectral representation of an
SGP.

Theorem 0.3 ([2], Theorem 11.21). Let $\{X_n\}_n$ be an SGP and $\mu$ be its
spectral measure. Then there exists a unique family of random variables
$Z_\xi = Z(\cdot,\xi)$, $\xi \in \mathbb{T}$ (i.e., $Z = Z(\omega,\xi)$ with
$\omega \in \Omega$, $\xi \in \mathbb{T}$) having the properties:

(1) For $\{\xi_1, \dots, \xi_k\}$, $\xi_j \in \mathbb{T}$, $\xi_i \ne \xi_j$,
    the random variables $\{Z_{\xi_j}\}_j$ are jointly normally distributed.

(2) For $I = [\xi_1, \xi_2) \subset \mathbb{T}$, one writes
    $Z(I) = Z_{\xi_2} - Z_{\xi_1}$ and

         $E(|Z(I)|^2) = \mu(I)$,    $E(Z(I_1)\overline{Z(I_2)}) = 0$,

    for $I_1 \cap I_2 = \emptyset$.

(3) Finally,

         $X_n = \int_{\mathbb{T}} \xi^n \, dZ(\cdot,\xi)$.

Denote by $L^2(\mathbf{X})$ the closed linear span of $\mathbf{X}$; this is a
subspace of $L^2(dP)$. Moreover, $L^2(\mathbf{X})$ is a Hilbert space with the
scalar product $(X, Y) = E(X \bar Y)$. The theorem says that the map
$U : L^2(\mathbf{X}) \to L^2(d\mu)$ defined by $U(X_n) = \xi^n$ is a unitary
one. If $\mathcal{Z} : L^2(\mathbf{X}) \to L^2(\mathbf{X})$ acts as
$\mathcal{Z} X_n = X_{n+1}$, its image under conjugation by $U$ (denoted by
the same letter) is the usual shift,

         $\mathcal{Z} f(t) = t f(t)$,  $f \in L^2(d\mu)$.

It is not difficult to guess that the classical orthogonal polynomials w.r.to
$\mu$ appear in this framework as solutions of forward/backward prediction
problems, see [4, Ch. 12]. One of the purposes of this paper is to show that
the ORFs play a similar role for a class of varying Gaussian processes, see
Section 2.



schur algorithm, orfs, gaussian processes and prediction 5 

1. Generalized moment problem revisited

We slightly specialize the construction from Section 0.1 to fit our needs.
The essential part of the business is in [9, Ch. 1,3]. Let
$\mathfrak{W} = \{w_k\}_k$ be a system of functions described above and
satisfying (0.2). Suppose that $\mathfrak{W}$ has a few additional
properties:

(1) $w_0 = 1$,

(2) One has $\bar w_j w_k \in \bar{\mathcal{L}}_j + \mathcal{L}_k$, or,
    equivalently,

(1.1)    $\bar w_j w_k = \sum_{s=-j}^{k} \beta_{jk,s} w_s$,

    where $\{\beta_{jk,s}\}_s$ are some coefficients and, as before,
    $w_{-s} = \bar w_s$.

(3) We have the following factorization property: if

         $\sum_{k=0}^{n} (a_k w_k + \bar a_k \bar w_k) \ge 0$

    on $\mathbb{T}$, then necessarily

(1.2)    $\sum_{k=0}^{n} (a_k w_k + \bar a_k \bar w_k) = \Big| \sum_{k=0}^{n} b_k w_k \Big|^2$

    for some $\{b_k\}$.

Notice that the coefficients $\{\beta_{jk,s}\}$ are uniquely determined by
the system $\mathfrak{W}$.

For a $\sigma \in \mathcal{M}(\mathbb{T})$, the matrix
$C_n = C_{n,\mathfrak{W},\sigma} = [c_{jk}]_{j,k=0,\dots,n}$ is defined as

(1.3)    $c_{jk} = \int_{\mathbb{T}} \bar w_j w_k \, d\sigma$.

Since $w_0 = 1$, one has $c_{0k} = c_k$ and, moreover,

(1.4)    $c_{jk} = \sum_{s=-j}^{k} \beta_{jk,s} c_s$,

where, once again, $c_{-s} = \bar c_s$ and obviously $c_{jk} = \bar c_{kj}$.
We say that a matrix $C_n = C_{n,\mathfrak{W}} = C_n^* \ge 0$ is a generalized
Toeplitz matrix w.r.to $\mathfrak{W}$ ($\mathfrak{W}$-GTM, GTM or GT-matrix,
for short), if its entries satisfy relations (1.4). An interesting and
important question is whether all $\mathfrak{W}$-GT-matrices come from a
$\sigma \in \mathcal{M}(\mathbb{T})$.

The answer (also explaining the terminology) is given by a slightly modified
version of Theorem 0.2. It says, in particular, that relations (1.3) and
(1.4) define the same object. Everything below is "w.r.to $\mathfrak{W}$".

Theorem 1.1. The following assertions are equivalent:

(1) the GT-matrix $C = [c_{jk}]_{j,k=0,\dots,n}$ is positive, $C \ge 0$,

(2) the sequence $\{c_k\}_{k=0,\dots,n}$ is positive (in the sense of
    implication (0.3)),

(3) there is a measure $\sigma \in \mathcal{M}(\mathbb{T})$ with the property

         $c_{jk} = \int_{\mathbb{T}} \bar w_j w_k \, d\sigma$,

(3') one has

         $c_k = \int_{\mathbb{T}} w_k \, d\sigma$.

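Proof. It is plain that (3) and (3') are equivalent. Indeed, (3) implies (3')
since $c_k = c_{0k} = \int_{\mathbb{T}} w_k \, d\sigma$. Conversely, let
$c_k = \int_{\mathbb{T}} w_k \, d\sigma$ for any $k$. So, by (1.4) and (1.1),

         $c_{jk} = \sum_{s=-j}^{k} \beta_{jk,s} c_s = \int_{\mathbb{T}} \sum_{s=-j}^{k} \beta_{jk,s} w_s \, d\sigma = \int_{\mathbb{T}} \bar w_j w_k \, d\sigma$.

Now, claims (2) and (3') are equivalent by Theorem 0.2. Claim (3) yields
(1) by the standard argument (see, for example, [9, Ch. 3]). The only
implication to prove is that (1) $\Rightarrow$ (3'), for instance.

By Theorem 0.2, we have to prove that if
$\sum_{k=0}^{n} (a_k w_k + \bar a_k \bar w_k) \ge 0$ for some $\{a_k\}$, then
necessarily $\sum_{k=0}^{n} (a_k c_k + \bar a_k \bar c_k) \ge 0$. By (1.2), one
has

         $\sum_{k=0}^{n} (a_k w_k + \bar a_k \bar w_k) = \sum_{j,k=0}^{n} \bar b_j b_k \bar w_j w_k$,

and, consequently,

         $\sum_{k=0}^{n} (a_k c_k + \bar a_k \bar c_k) = \sum_{k=0}^{n} (a_k \mathfrak{C}(w_k) + \bar a_k \mathfrak{C}(w_{-k}))$

         $= \sum_{j,k=0}^{n} \bar b_j b_k \mathfrak{C}(\bar w_j w_k) = \sum_{j,k=0}^{n} \bar b_j b_k \sum_{s=-j}^{k} \beta_{jk,s} \mathfrak{C}(w_s)$

         $= \sum_{j,k=0}^{n} \bar b_j b_k \sum_{s=-j}^{k} \beta_{jk,s} c_s = \sum_{j,k=0}^{n} \bar b_j b_k c_{jk} \ge 0$.

The proof is complete. □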

Let $\mathfrak{W} = \{w_k\}$, $\mathfrak{W}' = \{w'_k\}$ be two systems having
the above properties plus that
$\mathcal{L}_n = \mathrm{lin}\{w_k\}_{k=0,\dots,n} = \mathrm{lin}\{w'_k\}_{k=0,\dots,n}$.
Suppose also that

(1.5)    $w'_k = \sum_{s=0}^{k} d_{sk} w_s$,

$d_{kk} \ne 0$, so that the matrix
$D = D_{\mathfrak{W} \to \mathfrak{W}'} = [d_{sk}]_{s,k=0,\dots,n}$ of the
change of the basis is upper triangular and non-degenerate. The GT-matrices
$C_{n,\mathfrak{W}}$ and $C_{n,\mathfrak{W}'}$ are connected in the obvious way

(1.6)    $C_{n,\mathfrak{W}'} = D^* C_{n,\mathfrak{W}} D$.

Certainly, this is an equivalence relation and we may get from
$\mathfrak{W}'$ to $\mathfrak{W}$ if necessary.
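The change-of-basis relation $C_{n,\mathfrak{W}'} = D^* C_{n,\mathfrak{W}} D$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the measure is atomic, the system $\mathfrak{W}$ is taken to be the monomials $1, t, t^2$, the entries are computed as $c_{jk} = \int \bar w_j w_k\,d\sigma$, and the upper triangular matrix `D` is an arbitrary choice.

```python
import cmath

def gram(funcs, atoms, weights):
    """c_jk = sum_i weights[i] * conj(funcs[j](t_i)) * funcs[k](t_i)."""
    return [[sum(m * f(t).conjugate() * g(t) for t, m in zip(atoms, weights))
             for g in funcs] for f in funcs]

def matmul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def changed_basis(funcs, D):
    """Return w'_k = sum_{s <= k} D[s][k] * w_s, as in relation (1.5)."""
    def make(k):
        return lambda t: sum(D[s][k] * funcs[s](t) for s in range(k + 1))
    return [make(k) for k in range(len(funcs))]

def monomial(k):
    return lambda t: t ** k

# Hypothetical data: an atomic measure and an upper triangular D.
atoms = [cmath.exp(1j * angle) for angle in (0.3, 1.7, 4.0, 5.2)]
weights = [0.5, 1.0, 0.25, 0.75]
w = [monomial(k) for k in range(3)]
D = [[1, 2, 0.5j], [0, 1, -1], [0, 0, 2]]

C = gram(w, atoms, weights)
C_prime = gram(changed_basis(w, D), atoms, weights)
C_check = matmul(matmul(adjoint(D), C), D)
max_diff = max(abs(C_prime[j][k] - C_check[j][k])
               for j in range(3) for k in range(3))
```

Here `max_diff` measures the discrepancy between the matrix computed directly in the new basis and the conjugated matrix; it should vanish up to rounding.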

2. Varying Gaussian processes 

2.1. Spectral representation for the process. The point is to carry the
construction of Section 0.2 over to the processes whose covariance matrices
are generalized Toeplitz matrices in the sense of the previous section.

So, let $\mathfrak{W}$ be a fixed (infinite) system of functions. In addition
to assumptions (1)-(3) on page 5, we suppose that the family $\mathfrak{W}$ is
separating, that is:

(4) If, for a $\mu \in \mathcal{M}(\mathbb{T})$, one has

         $\int_{\mathbb{T}} w_k \, d\mu = 0$

    for all $k$, then $\mu = 0$.

Before giving the definitions, let us discuss the terminology we use.
Candidate labels for the objects introduced below were "generalized
stationary Gaussian processes" and "non-stationary Gaussian processes". It
turned out, however, that these names led to some misunderstanding instead of
clarifying the picture. For this reason, we stick with the name "a varying
Gaussian process". This name means that the statistics (e.g., covariance
matrices, etc.) of a Gaussian process under consideration vary according
to a prescribed law depending on a system of external parameters we may
choose to some extent. Of course, the class contains stationary Gaussian
processes. We do not imply at all that the processes from the above class are
non-stationary in a "wild" sense (i.e., the statistics of the process change
absolutely arbitrarily with the shift of the index).

So, we say that $\mathbf{X}$ is a varying Gaussian process (abbreviations: a
$\mathfrak{W}$-VGP, a VGP, or a VG-process), if
$C = C_{n_1,\dots,n_m} = C(X_{n_1}, \dots, X_{n_m})$ is a generalized
Toeplitz matrix w.r.to $\{w_{n_1}, \dots, w_{n_m}\}$. We denote by
$\mathcal{C} = \mathcal{C}_{\mathbf{X}} = \{C_n\}_n$ the sequence of the
covariance matrices of the process; they are $\mathfrak{W}$-GT matrices by
definition. Theorem 1.1 implies that there exists a unique
$\sigma \in \mathcal{M}(\mathbb{T})$ such that

         $C = C_{n_1,\dots,n_m} = \Big[ \int_{\mathbb{T}} \bar w_{n_j} w_{n_k} \, d\sigma \Big]_{j,k=1,\dots,m}$.

A counterpart to Theorem 0.3 in our situation is as follows.

Theorem 2.1. Let $\mathbf{X}$ be a VGP and $\sigma$ be its spectral measure.
Then there exists a unique family of random variables
$Z_\xi = Z(\cdot,\xi)$, $\xi \in \mathbb{T}$, with the properties:

(1) For $\{\xi_1, \dots, \xi_k\}$, $\xi_j \in \mathbb{T}$, $\xi_i \ne \xi_j$,
    the random variables $\{Z_{\xi_j}\}_j$ are jointly normally distributed.

(2) For $I = [\xi_1, \xi_2) \subset \mathbb{T}$, one writes
    $Z(I) = Z_{\xi_2} - Z_{\xi_1}$ and

         $E(|Z(I)|^2) = \sigma(I)$,    $E(Z(I_1)\overline{Z(I_2)}) = 0$,

    for $I_1 \cap I_2 = \emptyset$.

(3) Finally,

         $X_n = \int_{\mathbb{T}} w_n(\xi) \, dZ(\cdot,\xi)$.

As in Section 0.2, one has the unitary map
$U : L^2(\mathbf{X}) \to \mathcal{L} = \mathrm{clos\,lin}_{L^2(d\sigma)}\{w_k\} = L^2(d\sigma)$
(by the separation property (4) on page 7) and $U(X_n) = w_n$. The shift
defined as $\mathcal{Z} X_n = X_{n+1}$ goes to
$\mathcal{Z}(\sum_k a_k w_k) = \sum_k a_k w_{k+1}$; the sum of course is
finite.

Proof. The argument follows the proof of Theorem 0.3 (= [2, Theorem
11.21]) with Bochner's theorem replaced by Theorem 0.2. We give it for
the completeness of the presentation.

Let $\sigma$ be the spectral measure of a $\mathfrak{W}$-VG-process. Consider
$L^2(\mathbf{X}) = \mathrm{clos\,lin}_{L^2(dP)} \mathbf{X}$; this is obviously
a Hilbert space w.r.to the scalar product $(X, Y) = E(X \bar Y)$. Define a
map $U : L^2(\mathbf{X}) \to L^2(d\sigma)$ by the relation $U X_n = w_n$ and
extend it by linearity to finite linear combinations of $\{X_n\}$. The map
is obviously isometric, since for $Y_i = \sum_k a_{i,k} X_k$ and
$f_i = \sum_k a_{i,k} w_k$, $i = 1, 2$, one has

         $(Y_1, Y_2) = E\Big( \sum_{k,j} a_{1,k} \bar a_{2,j} X_k \bar X_j \Big) = \sum_{k,j} a_{1,k} \bar a_{2,j} E(X_k \bar X_j) = \sum_{k,j} a_{1,k} \bar a_{2,j} c_{jk}$

         $= \sum_{k,j} a_{1,k} \bar a_{2,j} \int_{\mathbb{T}} w_k \bar w_j \, d\sigma = \Big( \sum_k a_{1,k} w_k, \sum_k a_{2,k} w_k \Big)_{L^2(d\sigma)} = (f_1, f_2)_{L^2(d\sigma)}$.

We extend the map $U$ by continuity to act from $L^2(\mathbf{X})$ to
$L^2(d\sigma)$. It is easy to see that it is one-to-one and hence unitary.

Denote the (clockwise) arc $[1, \xi) \subset \mathbb{T}$ by $I_\xi$. Since
$\chi_{I_\xi} \in L^2(d\sigma)$, we set
$Z_\xi = U^{-1}(\chi_{I_\xi}) \in L^2(\mathbf{X})$, where $\chi_{I_\xi}$ is the
indicator function of $I_\xi$. We now verify the properties of $Z_\xi$
claimed in the formulation of the theorem. Let us start with (1): let
$\{Y^n\} = \{(Y_1^n, \dots, Y_m^n)\}_{n=0,1,\dots}$ be a sequence of
vector-valued random variables. Assume that $(Y_1^n, \dots, Y_m^n)$ are
jointly normally distributed and $Y^n \to Y = (Y_1, \dots, Y_m)$ in
$L^2(\mathbf{X})$. It follows that $(Y_1, \dots, Y_m)$ are jointly normally
distributed and $C(Y_j, Y_k) = \lim_{n \to \infty} C(Y_j^n, Y_k^n)$
(actually one has to argue for real-valued random variables and then pass to
complex-valued ones separating real and imaginary parts).

As for (2), we have for arcs $I, I_1, I_2 \subset \mathbb{T}$,
$I_1 \cap I_2 = \emptyset$,

         $E(|Z(I)|^2) = (\chi_I, \chi_I)_{L^2(d\sigma)} = \sigma(I)$

and

         $E(Z(I_1)\overline{Z(I_2)}) = (\chi_{I_1}, \chi_{I_2})_{L^2(d\sigma)} = 0$.

For (3), let $f$ be a continuous (and hence uniformly continuous) function
on $\mathbb{T}$. Take a partition of $\mathbb{T}$ into a family of
left-closed, right-open disjoint intervals $\{I_k\}$. Pick
$\lambda_k \in I_k$ and notice that

         $\sum_k f(\lambda_k) \chi_{I_k} = U\Big( \sum_k f(\lambda_k) Z(I_k) \Big)$.

When $\max_k |I_k| \to 0$ ($k \to \infty$), the argument of $U$ goes to
$\int_{\mathbb{T}} f(\xi) \, dZ(\cdot,\xi)$ (in $L^2(dP)$), and the left-hand
side of the equality goes uniformly to $f$. Hence

         $U^{-1} f = \int_{\mathbb{T}} f(\xi) \, dZ(\cdot,\xi)$.

Writing this for $f = w_n$, we come to

         $X_n = \int_{\mathbb{T}} w_n(\xi) \, dZ(\cdot,\xi)$.

The theorem is proved. □

We continue with a few remarks on the result. First, observe that we can
prove a similar result for stochastic (varying) non-Gaussian processes by
dropping the first point of the theorem; see Lamperti [8] for the stationary
case.

Second, since we use only $L^2$-convergence, the same proof goes through
for a fixed $\sigma$ with the property

(2.1)    $\sup_n \int_{\mathbb{T}} |w_n|^2 \, d\sigma < \infty$.

The condition means of course

(2.2)    $\sup_n E(|X_n|^2) < \infty$

in terms of the process $\mathbf{X}$.

Third, suppose two systems $\mathfrak{W}$ and $\mathfrak{W}'$ are related by
(1.6), but $\mathfrak{W}'$ does not necessarily have property (2.1). Of
course, Theorem 2.1 does not work directly for $\mathfrak{W}'$-VG-processes.
Nevertheless, applying an appropriate (non-stationary) linear filter to a
$\mathfrak{W}'$-VG-process $\mathbf{Y}$, we can get to a
$\mathfrak{W}$-VG-process $\mathbf{X}$, obtain its spectral representation
from Theorem 2.1, and return to the initial process $\mathbf{Y}$ using the
inverse filter. In formulas: let $D = D_{\mathfrak{W} \to \mathfrak{W}'}$,
$D' = D^{-1} = D_{\mathfrak{W}' \to \mathfrak{W}}$. Given a
$\mathfrak{W}'$-VG-process $\mathbf{Y}$, define the filtered process
$\mathbf{X}$ (compare to (1.5))

(2.3)    $X_k = \sum_{s=0}^{k} d'_{sk} Y_s$.

An easy computation shows $C_{n,\mathbf{X}} = D'^* C_{n,\mathbf{Y}} D'$, and the
identification with (1.6) shows that $\mathbf{X}$ is a
$\mathfrak{W}$-VG-process. By the spectral representation theorem (i.e.,
Theorem 2.1), $X_k = \int_{\mathbb{T}} w_k(\xi) \, dZ_{\mathbf{X}}(\cdot,\xi)$.
On the other hand,

(2.4)    $Y_k = \sum_{s=0}^{k} d_{sk} X_s$,

and

(2.5)    $Y_k = \int_{\mathbb{T}} \Big[ \sum_{s=0}^{k} d_{sk} w_s(\xi) \Big] dZ_{\mathbf{X}}(\cdot,\xi) = \int_{\mathbb{T}} w'_k(\xi) \, dZ_{\mathbf{X}}(\cdot,\xi)$.

Hence, speaking a bit loosely, Theorem 2.1 can be extended to VG-processes
which are "similar", in a proper sense, to processes satisfying its initial
assumptions.

2.2. Forward, backward and forward/backward prediction problems for
VGPs. We start with a bit of terminology. Let
$\mathfrak{W} = \{w_k\}_{k=0,\dots,\infty}$ and
$\sigma \in \mathcal{M}(\mathbb{T})$. Since the family $\mathfrak{W}$ is
linearly independent on $\mathbb{T}$, it is also linearly independent in
$L^2(d\sigma)$. The (generalized) orthogonal polynomials w.r.to
$\mathfrak{W}$ are denoted by $\{\varphi_k\}$,
$\|\varphi_k\|_{L^2(d\sigma)} = 1$. They are well-defined and obtained by the
Gram-Schmidt orthonormalization procedure from $\{w_k\}_k$ in
$L^2(d\sigma)$. The (generalized) reversed orthogonal polynomials
$\{\varphi_k^*\}$ w.r.to $\mathfrak{W}$ (warning: the notation is natural but
slightly abusive!) are defined by the orthonormalization procedure to satisfy
$\varphi_k^* \in \mathcal{L}_k$, $\|\varphi_k^*\|_{L^2(d\sigma)} = 1$, and
$(\varphi_k^*, w_j) = 0$ for $j = 1, \dots, k$. Equivalently, one can get
$\{\varphi_k^*\}$ by orthonormalizing $\{w_k, w_{k-1}, \dots, w_0\}$ and taking
the last element in the obtained sequence. The corresponding monic (also
abusive!) orthogonal polynomials are denoted by $\{\Phi_k\}$, $\{\Phi_k^*\}$.

When $\mathfrak{W} = \{w_k\}_{k \in \mathbb{Z}}$ is a bilateral system of
functions, we speak about (generalized) orthogonal Laurent polynomials (GOLP,
GOL-polynomials) $\{\chi_k\}_{k=0,\dots,\infty}$ (alternatively,
$\{\chi_k^*\}_{k=0,\dots,\infty}$) rather than "usual" ones. They are defined
by the Gram-Schmidt procedure from the sequence
$\{w_0, w_1, w_{-1}, w_2, w_{-2}, \dots\}$. The sequence $\{\chi_k^*\}_k$ comes
from orthonormalization of $\{w_0, w_{-1}, w_1, w_{-2}, w_2, \dots\}$. The monic
versions of these sequences will be denoted by $\{\chi_k^{m}\}_k$ and
$\{\chi_k^{m*}\}_k$.

We are interested in one-step prediction problems, see Bultheel et al. [4,
Ch. 12] for the details. To formulate the problem, let
$\mathcal{X}_{0,n-1} = \mathrm{lin}\{X_k : k = 0, \dots, n-1\}$ and
$\mathcal{X}_{1,n} = \mathrm{lin}\{X_k : k = 1, \dots, n\}$. The forward
prediction problem is to compute
$\hat X_n = (X_n | \mathcal{X}_{0,n-1}) = \mathrm{Pr}_{\mathcal{X}_{0,n-1}} X_n$
and the backward one targets
$\hat X_0 = (X_0 | \mathcal{X}_{1,n}) = \mathrm{Pr}_{\mathcal{X}_{1,n}} X_0$.
The projections are understood in the $L^2(dP)$-sense and $(\cdot | \dots)$
is a probabilistic notation for the object. Recall that the projections can
be interpreted as conditioned random variables. We also look at the
corresponding innovation processes $E_n = X_n - \hat X_n$ and
$E_0 = X_0 - \hat X_0$.

By the spectral representation theorem (i.e., Theorem 2.1), this is
equivalent to the computation of projections
$\hat w_n = \mathrm{Pr}_{\mathcal{W}_{0,n-1}} w_n$ and
$\hat w_0 = \mathrm{Pr}_{\mathcal{W}_{1,n}} w_0$, where
$\mathcal{W}_{0,n-1} = \mathrm{lin}\{w_k : k = 0, \dots, n-1\}$ and
$\mathcal{W}_{1,n} = \mathrm{lin}\{w_k : k = 1, \dots, n\}$. It is not
difficult to see that, by definition, $\hat w_n = w_n - \Phi_n$ and
$\hat w_0 = w_0 - \Phi_n^*$ etc. Recalling that $X_n = \mathcal{Z}^n X_0$, we
get back to the process $\mathbf{X}$. This gives

(2.6)    $\hat X_n = (\mathcal{Z}^n - \Phi_n(\mathcal{Z})) X_0$,    $E_n = \Phi_n(\mathcal{Z}) X_0$,

         $\hat X_0 = (I - \Phi_n^*(\mathcal{Z})) X_0$,    $E_0 = \Phi_n^*(\mathcal{Z}) X_0$.
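To make the forward prediction problem concrete, here is a small numerical sketch under simplifying assumptions: the spectral measure is atomic (so $L^2(d\sigma)$ is finite-dimensional), the system is a family of Blaschke products with hypothetical zeros `alphas`, and the scalar product is $(f, g) = \sum_i m_i f(t_i)\overline{g(t_i)}$. The residual of $w_n$ after projecting onto $\mathrm{lin}\{w_0, \dots, w_{n-1}\}$ plays the role of the monic polynomial $\Phi_n$, and its norm is the size of the one-step prediction error.

```python
import cmath
import math

def blaschke(t, alphas):
    """Product of the factors (t - a) / (1 - conj(a) t); equals 1 if alphas is empty."""
    value = 1 + 0j
    for a in alphas:
        value *= (t - a) / (1 - a.conjugate() * t)
    return value

def inner(f, g, weights):
    """Scalar product of two sampled functions in L^2 of the atomic measure."""
    return sum(m * x * y.conjugate() for x, y, m in zip(f, g, weights))

def forward_residuals(samples, weights):
    """Modified Gram-Schmidt: return the orthonormal system and the residuals."""
    ortho, residuals = [], []
    for v in samples:
        r = list(v)
        for q in ortho:
            coef = inner(r, q, weights)
            r = [ri - coef * qi for ri, qi in zip(r, q)]
        norm = math.sqrt(inner(r, r, weights).real)
        residuals.append(r)
        ortho.append([ri / norm for ri in r])
    return ortho, residuals

# Hypothetical data: six atoms on the circle and three zeros in the disc.
atoms = [cmath.exp(1j * angle) for angle in (0.2, 1.1, 2.0, 3.3, 4.4, 5.5)]
weights = [0.4, 0.1, 0.3, 0.2, 0.6, 0.4]
alphas = [0.5 + 0j, -0.3 + 0.4j, 0.2 - 0.6j]

# samples[k] holds the values of w_k = B_k at the atoms (w_0 = 1).
samples = [[blaschke(t, alphas[:k]) for t in atoms] for k in range(4)]
phi, Phi = forward_residuals(samples, weights)
errors = [math.sqrt(inner(r, r, weights).real) for r in Phi]
```

In this setting `errors[n]` is the norm of the residual of $w_n$, which by the unitary correspondence equals $E(|X_n - \hat X_n|^2)^{1/2}$ for a process with this spectral measure.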

By a mixed prediction problem (or backward-forward prediction problem)
we understand an estimate of the present from a part of the past plus a
part of the future. The construction goes along the same lines as above,
which is why we give the formulation of the problem followed by its
solution. So, let
$\mathcal{X}_{-n,-1;1,n} = \mathrm{lin}\{X_k : k = -n, \dots, n,\ k \ne 0\}$,
$\mathcal{X}_{-(n+1),-1;1,n} = \mathrm{lin}\{X_k : k = -(n+1), \dots, n,\ k \ne 0\}$,
and
$\mathcal{X}_{-n,-1;1,n+1} = \mathrm{lin}\{X_k : k = -n, \dots, n+1,\ k \ne 0\}$.
The different mixed prediction problems are: find

         $\hat X_0 = (X_0 | \mathcal{X}_{-n,-1;1,n})$,  $\hat X_0 = (X_0 | \mathcal{X}_{-n,-1;1,n+1})$,  and  $\hat X_0 = (X_0 | \mathcal{X}_{-(n+1),-1;1,n})$

and expressions for the corresponding innovation processes. The first case
can be treated with the help of either $\{\chi_k\}_k$ or $\{\chi_k^*\}_k$, the
second case suggests the use of $\{\chi_k\}_k$, and the third one the
application of $\{\chi_k^*\}_k$. We obtain

         $\hat X_0 = (I - \chi^{m}_{2n}(\mathcal{Z})) X_0$,    $E_0 = X_0 - \hat X_0 = \chi^{m}_{2n}(\mathcal{Z}) X_0$,
         $\hat X_0 = (I - \chi^{m}_{2n+1}(\mathcal{Z})) X_0$,    $E_0 = X_0 - \hat X_0 = \chi^{m}_{2n+1}(\mathcal{Z}) X_0$,
         $\hat X_0 = (I - \chi^{m*}_{2n+1}(\mathcal{Z})) X_0$,    $E_0 = X_0 - \hat X_0 = \chi^{m*}_{2n+1}(\mathcal{Z}) X_0$.
3. Orthogonal rational functions and a special class of
moment problems and VG-processes

The point of this section is to illustrate the construction from Sections 1,
2 with the help of the so-called orthogonal rational functions (ORFs, to be
brief) on the unit circle $\mathbb{T}$. The monograph by Bultheel et al. [4]
gives the state of the art of the subject. A shorter overview is in Bultheel
et al. [5]. The paper Baratchart et al. [1] contains some recent advances on
the topic; our notation follows the last paper.

3.1. ORFs and generalized moment problems. Let
$\{\alpha_k\}_{k=0,\dots,\infty}$, $\alpha_0 = 0$, be a given sequence of
points in $\mathbb{D}$ and, as in [1],

(3.1)    $\sum_k (1 - |\alpha_k|) = +\infty$.

We suppose for simplicity that $\alpha_k \ne \alpha_j$, $k \ne j$; it goes
without saying that the general situation can be treated along the same
lines. Let

         $\zeta_k(t) = \frac{t - \alpha_k}{1 - \bar\alpha_k t}$,    $B_0 = 1$,    $B_k = \prod_{j=1}^{k} \zeta_j$

for $k \ge 1$. Furthermore, let

         $\mathcal{L}_n = \mathrm{lin}\{B_k : k = 0, \dots, n\} = \mathrm{lin}\Big\{ \frac{1}{1 - \bar\alpha_k t} : k = 0, \dots, n \Big\}$.
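A short numerical sketch of these objects may help fix the notation. It assumes the factors $\zeta_k(t) = (t - \alpha_k)/(1 - \bar\alpha_k t)$ and the products $B_k = \zeta_1 \cdots \zeta_k$; the particular zeros and sample points are hypothetical. It checks that $|B_k| = 1$ on the unit circle and that each factor is an affine function of the kernel $1/(1 - \bar\alpha_k t)$, which is why the two families span the same space.

```python
import cmath

def zeta(t, a):
    """Single Blaschke factor with zero at a."""
    return (t - a) / (1 - a.conjugate() * t)

def blaschke(t, alphas):
    """Finite Blaschke product with the given zeros."""
    value = 1 + 0j
    for a in alphas:
        value *= zeta(t, a)
    return value

def zeta_via_kernel(t, a):
    """The same factor written through the kernel 1 / (1 - conj(a) t), a != 0."""
    rho = 1 - abs(a) ** 2
    return (rho / (1 - a.conjugate() * t) - 1) / a.conjugate()

# Hypothetical zeros in the unit disc and sample points on the unit circle.
alphas = [0.5 + 0j, -0.3 + 0.4j, 0.2 - 0.6j]
points = [cmath.exp(1j * angle) for angle in (0.0, 0.7, 2.5, 4.1)]

moduli = [abs(blaschke(t, alphas)) for t in points]
gap = max(abs(zeta(t, a) - zeta_via_kernel(t, a))
          for t in points for a in alphas)
```

Both checks are algebraic identities, so `moduli` should consist of ones and `gap` should vanish up to rounding.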

We consider the following systems $\mathfrak{W}_i = \{w_{ik}\}$, $i = 1, 2, 3$:

(1) $w_{10} = 1$, $w_{1k} = B_k$, $k = 1, \dots, n$,

(2) $w_{20} = 1$, $w_{2k} = \frac{1}{1 - \bar\alpha_k t}$, $k = 1, \dots, n$,

(3) $w_{30} = 1$, $w_{3k} = \frac{t^k}{\prod_{j=1}^{k} (1 - \bar\alpha_j t)}$, $k = 1, \dots, n$.

Obviously, every system $\mathfrak{W}_i$ forms a basis of $\mathcal{L}_n$. The
first system is natural in applications to VG-processes with the property
$\sup_n E(|X_n|^2) < \infty$. The second one is from [9, Ch. 3, Sect. 2] and
appears readily from interpolation (e.g., Schur-Nevanlinna-Pick) problems.
One of its drawbacks though is that the condition (2.1) for any
$\sigma \in \mathcal{M}(\mathbb{T})$ is satisfied iff the sequence
$\{\alpha_k\}$ is compactly contained in $\mathbb{D}$. To fix this, one can
consider the system $\tilde{\mathfrak{W}}_2$

         $\tilde w_{20} = 1$,    $\tilde w_{2k} = \frac{1 - |\alpha_k|^2}{1 - \bar\alpha_k t}$,    $k = 1, \dots, n$

instead. The third system is from [4, Ch. 9] and it has nice invariance
properties w.r.to the $*$-operation.

As the following proposition shows, the systems are completely equivalent
for our purposes.

Proposition 3.1. We have:

(1) The change-of-basis matrices taking one $\mathfrak{W}_i$ to another are
    upper triangular and non-degenerate.

(2) The systems $\mathfrak{W}_i$ satisfy assumptions (1)-(3) and (4) on
    pp. 5, 7.

Proof. First, observe that

(3.2)    $w_{1k} = B_k = A_{0k} + \sum_{s=1}^{k} \frac{1}{\overline{B_{k,s}(\alpha_s)}} \, \zeta_s$,

where $B_{k,s} = \prod_{j \ne s} \zeta_j$ and $A_{0k}$ is an easily computable
constant. The proof is by inspection of residues of the LHS and the RHS at
the points $\{1/\bar\alpha_j\}$. We also have
$\zeta_k = \frac{1}{\bar\alpha_k} \Big( \frac{\rho_k}{1 - \bar\alpha_k t} - 1 \Big)$,
where $\rho_k = 1 - |\alpha_k|^2$ and $k \ge 1$. So, the first claim of the
proposition is true for the passage from (2) to (1).

To go from (3) to (2), we write

         $w_{3k} = B_{0k} + \sum_{s=1}^{k} \frac{1}{\bar\alpha_s \prod_{j \ne s} (\bar\alpha_s - \bar\alpha_j)} \, w_{2s}$,

the proof being similar; $B_{0k}$ is a constant once again. The rest follows
from the group properties of the square upper triangular matrices (with the
non-degenerate diagonal).

This shows that it is enough to prove the second claim of the proposition
for any one of the $\mathfrak{W}_i$; the rest will enjoy the same properties.
The key (and elementary) point is the following very well-known fact (see,
for instance, [9, Ch. 3, Sect. 2]): a trigonometric polynomial is
non-negative on $\mathbb{T}$ iff it can be represented as an analytic
polynomial times its conjugate, i.e.,

(3.3)    $0 \le \sum_{j=0}^{n} (a_j t^j + \bar a_j \bar t^j) = \Big| \sum_{j=0}^{n} b_j t^j \Big|^2$.

We go through the (easy) proof of (2) of the current proposition for the
system $\mathfrak{W}_1$. (1), p. 5 being clear, we notice that, for $j < k$,

         $\bar w_{1j} w_{1k} = \zeta_{j+1} \cdots \zeta_k$,

and relation (2), p. 5 follows from (3.2). As for (3), p. 5, let

         $\sum_{k=0}^{n} (a_k B_k + \bar a_k \bar B_k) \ge 0$

on $\mathbb{T}$ for some $\{a_k\}$. We can rewrite the latter expression as
$f(t) / \prod_{k=1}^{n} |1 - \bar\alpha_k t|^2$, where $f \ge 0$ on
$\mathbb{T}$ is a non-negative trigonometric polynomial. Relation (3.3)
implies that $f = p \, \bar p$, where $p$ is an analytic polynomial,
$\deg p \le n$. Hence,

         $\sum_{k=0}^{n} (a_k B_k + \bar a_k \bar B_k) = h \, \bar h$,

where $h = p / \prod_{k=1}^{n} (1 - \bar\alpha_k t)$, which is definitely in
$\mathcal{L}_n$.

As for (4), p. 7, we see that (3.1) says that
$\mathrm{lin}\{\mathcal{L}_n + \bar{\mathcal{L}}_n : n\}$ is dense in
$\mathcal{A}(\mathbb{T}) + \overline{\mathcal{A}(\mathbb{T})}$, which is, in
turn, dense in $C(\mathbb{T})$. The separation property and the proposition
follow. □
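The residue argument used in the proof can be sanity-checked numerically. The sketch below assumes the expansion in the form $B_k = A + \sum_s c_s \zeta_s$ with $c_s = 1/\overline{B_{k,s}(\alpha_s)}$, where $B_{k,s}$ is the product with the $s$-th factor dropped; the zeros and evaluation points are hypothetical. If the coefficients are right, the difference $B_k - \sum_s c_s \zeta_s$ must take the same value at every point.

```python
import cmath

def zeta(t, a):
    """Single Blaschke factor with zero at a."""
    return (t - a) / (1 - a.conjugate() * t)

def product(t, alphas, skip=None):
    """Product of the factors, optionally dropping the factor with index skip."""
    value = 1 + 0j
    for index, a in enumerate(alphas):
        if index != skip:
            value *= zeta(t, a)
    return value

def expansion_coefficients(alphas):
    """c_s = 1 / conj(B_{k,s}(alpha_s)); requires pairwise distinct zeros."""
    return [1 / product(a, alphas, skip=s).conjugate()
            for s, a in enumerate(alphas)]

def remainder(t, alphas, coeffs):
    """B_k(t) minus the linear combination of the single factors."""
    return product(t, alphas) - sum(c * zeta(t, a)
                                    for c, a in zip(coeffs, alphas))

# Hypothetical distinct zeros; the last evaluation point lies inside the disc.
alphas = [0.5 + 0j, -0.3 + 0.4j, 0.2 - 0.6j]
coeffs = expansion_coefficients(alphas)
points = [cmath.exp(1j * angle) for angle in (0.1, 1.3, 2.9, 4.6)] + [0.3 + 0.2j]

values = [remainder(t, alphas, coeffs) for t in points]
spread = max(abs(v - values[0]) for v in values)
```

A vanishing `spread` indicates that the remainder is the constant term of the expansion, as the comparison of residues predicts.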

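Suppose now that $\sigma \in \mathcal{M}(\mathbb{T})$ and
$c^i_k = c^i_{0k} = \int_{\mathbb{T}} w_{ik} \, d\sigma$, or, more generally,

(3.4)    $c^i_{jk} = \int_{\mathbb{T}} \bar w_{ij} w_{ik} \, d\sigma$,

for $i = 1, 2, 3$. Of course $c^i_{jk} = \bar c^i_{kj}$. The corresponding
generalized moment (Toeplitz) matrix looks like

         $C^i_n = \begin{pmatrix} c^i_{00} & c^i_{01} & \cdots & c^i_{0n} \\ c^i_{10} & c^i_{11} & c^i_{12} & \cdots \\ \vdots & & \ddots & \vdots \\ c^i_{n0} & \cdots & & c^i_{nn} \end{pmatrix} \ge 0$.

As Proposition 3.1 shows, the matrices $C^i_n$ are conjugated by
non-degenerate upper triangular matrices.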

From now on, we deal mostly with $C^1$ and $C^2$. The same development
works for $C^3$ but is rather cumbersome, which is why we do not give it.
Recall that $c^1_{jj} = c^1_{00}$. Put $B_{jk} = \prod_{s=j+1}^{k} \zeta_s$ and
notice that

(3.5)    $c^1_{jk} = \int_{\mathbb{T}} \bar w_{1j} w_{1k} \, d\sigma = \int_{\mathbb{T}} B_{jk} \, d\sigma$,

and, with the help of (3.2),

         $B_{jk} = A_{jk} + \sum_{s=j+1}^{k} \frac{1}{\overline{B_{jk,s}(\alpha_s)}} \, \zeta_s$,

where $A_{jk}$ is a coefficient (which is easy to compute) and $B_{jk,s}$
stands for the product $B_{jk}$ with the $s$-th factor dropped. So, since
$\int_{\mathbb{T}} \zeta_s \, d\sigma = c^1_{s-1,s}$,

(3.6)    $c^1_{jk} = A_{jk} c^1_{00} + \sum_{s=j+1}^{k} \frac{c^1_{s-1,s}}{\overline{B_{jk,s}(\alpha_s)}}$.

Hence, if the matrix $C^1_n$ is defined by relations (3.4), we have
$C^1_n = (C^1_n)^* \ge 0$ and it conforms to relations (3.6). It is natural to
ask the converse: is it true that a non-negative matrix $C^1$ satisfying
relations (3.6) can be represented as a Toeplitz matrix of a measure
$\sigma \in \mathcal{M}(\mathbb{T})$ (i.e., satisfies (3.4))?

The answer is yes and it is given by a straightforward consequence of
Theorem 1.1.

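Proposition 3.2. The following assertions are equivalent:

- There is a $C^1 = (C^1)^* \ge 0$ satisfying relations (3.6) for all $j, k$.

- There is a measure $\sigma \in \mathcal{M}(\mathbb{T})$ generating $C^1$
  through relations (3.4) for all $j, k$.

With little changes, the same proposition holds for $\mathfrak{W}_2$, the only
difference is that one has to use

         $\overline{\Big( \frac{1}{1 - \bar\alpha_j t} \Big)} \, \frac{1}{1 - \bar\alpha_k t} = \frac{1}{1 - \alpha_j \bar\alpha_k} \Big( \frac{1}{1 - \bar\alpha_k t} + \overline{\Big( \frac{1}{1 - \bar\alpha_j t} \Big)} - 1 \Big)$

instead of (3.5).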
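The product formula for the second system is an elementary identity on the unit circle, and a quick numerical check is a cheap safeguard against sign or conjugation slips. In the sketch below the kernel is assumed to be $w_2(t) = 1/(1 - \bar\alpha t)$; the two parameters `aj`, `ak` and the sample points are hypothetical.

```python
import cmath

def w2(t, a):
    """Kernel 1 / (1 - conj(a) t) attached to the point a of the disc."""
    return 1 / (1 - a.conjugate() * t)

def product_side(t, aj, ak):
    """conj(w2(t, aj)) * w2(t, ak)."""
    return w2(t, aj).conjugate() * w2(t, ak)

def linear_side(t, aj, ak):
    """(w2(t, ak) + conj(w2(t, aj)) - 1) / (1 - aj * conj(ak))."""
    return (w2(t, ak) + w2(t, aj).conjugate() - 1) / (1 - aj * ak.conjugate())

# Hypothetical parameters in the disc; the identity uses |t| = 1.
aj, ak = 0.5 + 0.1j, -0.3 + 0.4j
points = [cmath.exp(1j * angle) for angle in (0.0, 0.9, 2.2, 3.8, 5.1)]
gap = max(abs(product_side(t, aj, ak) - linear_side(t, aj, ak))
          for t in points)
```

Integrating both sides against a measure expresses the entry with indices $j, k$ of the moment matrix through the moments with indices $k$, $j$ and $0$.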

For the sake of completeness, we give a criterion for the solvability of the
moment problem for $\{c^i_k\}$. Up to inessential details, it is borrowed
from [9, Ch. 3, Sect. 2].

Proposition 3.3. The moment problem (see (3'), Theorem 1.1) is solvable if
and only if:

(1) For $\mathfrak{W}_1$:

         $\Big[ \frac{\rho_j \bar\alpha_k c^1_k + \rho_k \alpha_j \bar c^1_j + c_0 (1 - |\alpha_k|^2 |\alpha_j|^2)}{1 - \bar\alpha_k \alpha_j} \Big]_{j,k} \ge 0$.

(2) For $\mathfrak{W}_2$:

         $\Big[ \frac{c^2_k + \bar c^2_j - c_0}{1 - \bar\alpha_k \alpha_j} \Big]_{j,k} \ge 0$.

The remark on the link between $\zeta_k$ and $1/(1 - \bar\alpha_k t)$ next to
(3.2) implies that $c^2_k = \frac{1}{\rho_k} (\bar\alpha_k c^1_k + c_0)$, so
the first claim follows from the second one.

3.2. ORFs, Blaschke-VG-processes and prediction problems. Once
again, the purpose of this subsection is to specialize the content of
Sections 2.1, 2.2 to the special case of the system $\mathfrak{W}_1$. The
construction for $\mathfrak{W}_i$, $i = 2, 3$, is completely analogous and we
leave it to the interested reader.

So, let $\{\alpha_k\}_k$ be a sequence of points in $\mathbb{D}$ satisfying
the assumptions of the previous subsection. Let $\mathbf{X}$ be a varying
Gaussian process (see the definition in Section 2.1) such that
$C = C_{n,\dots,n+m} = C(X_n, \dots, X_{n+m})$ is a generalized Toeplitz
matrix w.r.to $\{w_{1,n}, \dots, w_{1,n+m}\}$ (a Blaschke-VG-process, to be
brief). Proposition 3.2 implies that there exists a unique
$\sigma \in \mathcal{M}(\mathbb{T})$ such that

         $C = C_{n,\dots,n+m} = \Big[ \int_{\mathbb{T}} \bar w_{1j} w_{1k} \, d\sigma \Big]_{j,k=n,\dots,n+m}$.

Notice that $C = C^* \ge 0$ and it satisfies relations (3.6). This means that
we can calculate $C_{n,\dots,n+m+1}$ having $C_{n,\dots,n+m}$ and the element
$c_{n+m,n+m+1}$. Indeed, just apply (3.6) to recover the last column and the
last row of the "new" matrix $C_{n,\dots,n+m+1}$. Hence, to get, say,
$C_{n+l,\dots,n+m+l}$ from $C_{n,\dots,n+m}$, do the above step-by-step
procedure to come to $C_{n,\dots,n+m+l}$ and then throw away the first $l$
columns and rows. A counterpart to Theorem 2.1 in this situation is as
follows.

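Proposition 3.4. Let $\mathbf{X}$ be a Blaschke-VGP and $\sigma$ be its
spectral measure. Then there exists a unique family of random variables
$Z_\xi = Z(\cdot,\xi)$, $\xi \in \mathbb{T}$, with the properties:

(1) For $\{\xi_1, \dots, \xi_k\}$, $\xi_j \in \mathbb{T}$, $\xi_i \ne \xi_j$,
    the random variables $\{Z_{\xi_j}\}_j$ are jointly normally distributed.

(2) For $I = [\xi_1, \xi_2) \subset \mathbb{T}$, one writes
    $Z(I) = Z_{\xi_2} - Z_{\xi_1}$ and

         $E(|Z(I)|^2) = \sigma(I)$,    $E(Z(I_1)\overline{Z(I_2)}) = 0$,

    for $I_1 \cap I_2 = \emptyset$.

(3) Finally,

         $X_n = \int_{\mathbb{T}} B_n(\xi) \, dZ(\cdot,\xi)$.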

Now, the generalized orthogonal polynomials
$\{\varphi_k, \varphi_k^*, \Phi_k, \Phi_k^*\}$ appearing in Section 2.2 are
precisely orthogonal rational functions (denoted by the same letters) studied
in [4, 5, 1]. So, the forward and backward prediction problems from
Section 2.2 for Blaschke-VG-processes are solved precisely by the ORFs. The
asymptotics of the ORFs $\{\varphi_n, \varphi_n^*\}$ describe the behavior of
the predictors for $n \to \infty$. The study of these asymptotics is an
interesting (and challenging) analytical problem; this is the main purpose of
the paper Baratchart et al. [1], where, in particular, special attention is
paid to the measures $\sigma$ from the so-called Szegő class.

In the rest of this subsection we extensively use the terminology and
notation of [1]; see especially [1, Sect. 5]. To give a sample of application
of results of the work, let us turn back to relations (2.6). Recall that
$E_n = X_n - \hat X_n$ is precisely the one-step-ahead prediction error and
its "energy" is

         $\mathcal{E}_n = E(|E_n|^2)^{1/2} = E(|X_n - \hat X_n|^2)^{1/2} = \|\Phi_n\|_{L^2(d\sigma)}$.

Proposition 3.5. Suppose that the assumptions of [1], Theorem 5.8, hold true.
Let $\mathbf{X}$ be a $\mathfrak{W}_1$-VG-process and $\sigma$ be its spectral
measure (lying in the Szegő class). Let $\{\alpha_{m_n}\}$ be a subsequence
of $\{\alpha_n\}$ and $\lim_{n \to \infty} \alpha_{m_n} = \alpha \in \bar{\mathbb{D}}$.

- For $\alpha \in \mathbb{D}$,  $\lim_n \mathcal{E}_{m_n} = (1 - |\alpha|^2)^{1/2} |S(\alpha)|$.

- For $\alpha \in \mathbb{T}$,  $\lim_n \mathcal{E}_{m_n} = 0$.

The proposition says, in particular, that in the second case the
corresponding subprocess is asymptotically deterministic.



4. ARMA-PROCESSES AND THEIR VARYING GAUSSIAN COUNTERPARTS 

4.1. Reminder on "classical" ARMA-processes. ARMA-(p, q) (i.e.,
autoregressive moving average) processes form a subclass of stationary Gauss- 
ian processes. They play an important role in applications and are widely 
studied. This subsection recalls some basic definitions and results on the 
topic. Its content is borrowed from Brockwell-Davis [3, Ch. 3, 4], which 
extensively discusses different issues pertaining to the subject. 

Let $Z = \{Z_n\}_n$ be a SGP with covariance matrices $C = \{C_n\}_n$ of the
form $C_n = \operatorname{diag}\{\delta^2\}$ and zero mean. The process $Z$ is called a white noise
(WN($\delta^2$, 0), to be brief).

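Let $X$ be a SGP with covariance matrices given by

$$C_n = [c_{k-j}]_{j,k=0,\dots,n} = \begin{bmatrix} c_0 & c_1 & \cdots & \\ c_{-1} & \ddots & \ddots & \vdots \\ \vdots & \ddots & c_0 & c_1 \\ & \cdots & c_{-1} & c_0 \end{bmatrix}, \tag{4.1}$$

where, as always, $c_{-s} = \overline{c_s}$.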

Let $\mathcal{Z}'$ be a shift operator acting as $\mathcal{Z}'X_n = X_{n-1}$. For polynomials
$\varphi, \theta$, $\deg\varphi = p$, $\deg\theta = q$, $X$ is called an ARMA-$(p,q)$ process [3, Ch. 3,
Sect. 1], if it satisfies the equation

$$\varphi(\mathcal{Z}')X_n = \theta(\mathcal{Z}')Z_n,$$

where $Z$ is a WN($\delta^2$, 0). Notice that if the operator $\varphi(\mathcal{Z}')$ is invertible
(in some sense), we can rewrite the above equation formally as

$$X_n = \varphi(\mathcal{Z}')^{-1}\theta(\mathcal{Z}')Z_n.$$

Assuming now that $\psi = \{\psi_j\} \in \ell^1$, we see that

$$Y_n = \psi(\mathcal{Z}')X_n = \sum_{j=-\infty}^{\infty} \psi_j X_{n-j}$$

converges in $L^2(dP)$, [3], Proposition 3.1.1, and hence is well-defined. Setting
$Y = \{Y_n\}$, we readily compute its covariance matrices [3], Proposition
3.1.2,

$$C_Y = \{\tilde{C}_n\}_n, \quad \tilde{C}_n = [\tilde{c}_{k-j}]_{j,k=0,\dots,n}, \quad \tilde{c}_h = \sum_{j,k=-\infty}^{\infty} \psi_j \overline{\psi_k}\, c_{h-j+k}. \tag{4.2}$$

We see, in particular, that $Y$ is a SGP. We say that $X$ is causal, if
$X_n = \sum_{j=0}^{\infty} \psi_j Z_{n-j}$.

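Theorem 4.1 ([3], Theorem 3.1.1). Let $X$ be an ARMA-$(p,q)$ process,
$\varphi(\mathcal{Z}')X_n = \theta(\mathcal{Z}')Z_n$, and let the polynomials $\varphi, \theta$ have no common zeroes. Then
$X$ is causal iff the zeroes of $\varphi$ lie in $\{|z| > 1\}$. In this case,

$$X_n = \psi(\mathcal{Z}')Z_n = \sum_{j=-\infty}^{\infty} \psi_j Z_{n-j}, \tag{4.3}$$

where

$$\psi(z) = \frac{\theta(z)}{\varphi(z)} = \sum_{j=-\infty}^{\infty} \psi_j z^j \quad \text{for } |z| \le 1,$$

with $\psi_j = 0$ for $j < 0$.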
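The criterion of Theorem 4.1 and the coefficients of the expansion of $\theta/\varphi$ can be checked numerically. The sketch below is only an illustration: the function names `is_causal` and `psi_coefficients` and the ARMA-(1,1) example are hypothetical, and polynomials are stored as coefficient lists in increasing powers of $z$.

```python
import numpy as np

def is_causal(phi):
    """phi = [phi_0, ..., phi_p] are the coefficients of phi(z) = sum_k phi_k z^k.
    Returns True when all zeroes of phi lie in {|z| > 1}."""
    phi = np.asarray(phi, dtype=complex)
    roots = np.roots(phi[::-1])  # np.roots expects the highest power first
    return bool(np.all(np.abs(roots) > 1.0))

def psi_coefficients(phi, theta, n_terms):
    """First n_terms Taylor coefficients at z = 0 of psi(z) = theta(z) / phi(z),
    found by matching powers of z in phi(z) * psi(z) = theta(z)."""
    phi = np.asarray(phi, dtype=complex)
    theta = np.asarray(theta, dtype=complex)
    psi = np.zeros(n_terms, dtype=complex)
    for j in range(n_terms):
        acc = theta[j] if j < len(theta) else 0.0
        for k in range(1, min(j, len(phi) - 1) + 1):
            acc -= phi[k] * psi[j - k]
        psi[j] = acc / phi[0]
    return psi

# Hypothetical example: phi(z) = 1 - 0.5 z (zero at z = 2), theta(z) = 1 + 0.4 z.
phi = [1.0, -0.5]
theta = [1.0, 0.4]
causal = is_causal(phi)
psi = psi_coefficients(phi, theta, 6)
```

In this example the coefficients decay geometrically, which is what makes the series in (4.3) converge.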



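Furthermore, the spectral measure for the above process $X$ can be easily
written down, [3], Theorem 4.4.2,

$$d\sigma_X(t) = \delta^2\, \frac{|\theta(\bar{t})|^2}{|\varphi(\bar{t})|^2}\, dm(t), \tag{4.4}$$

where $t = e^{i\phi} \in \mathbb{T}$ and $dm(t) = \frac{dt}{2\pi i t} = \frac{1}{2\pi}\, d\phi$ is the normalized Lebesgue
measure on the unit circle. Moreover, if $Z_n = \int_{\mathbb{T}} \xi^n\, dZ_Z(\cdot,\xi)$, we obtain the
spectral decomposition for $X$

$$X_n = \int_{\mathbb{T}} \xi^n\, \frac{\theta(\bar{\xi})}{\varphi(\bar{\xi})}\, dZ_Z(\cdot,\xi), \tag{4.5}$$

see Theorem 0.3.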
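Formula (4.4) can be sanity-checked by computing the covariances of a causal ARMA process in two ways: from the coefficients $\psi_j$ of (4.3), and as Fourier coefficients of the density in (4.4). The sketch below assumes real polynomial coefficients (so the conjugation of the argument does not affect the modulus) and uses a hypothetical ARMA-(1,1) example; all names and numerical values are illustrative.

```python
import numpy as np
from numpy.polynomial import polynomial as P

delta2 = 2.0                    # white-noise variance delta^2 (hypothetical value)
phi = np.array([1.0, -0.5])     # phi(z) = 1 - 0.5 z, zero at z = 2
theta = np.array([1.0, 0.4])    # theta(z) = 1 + 0.4 z

def psi_coefficients(phi, theta, n_terms):
    """Taylor coefficients of theta/phi from phi(z) * psi(z) = theta(z)."""
    psi = np.zeros(n_terms)
    for j in range(n_terms):
        acc = theta[j] if j < len(theta) else 0.0
        for k in range(1, min(j, len(phi) - 1) + 1):
            acc -= phi[k] * psi[j - k]
        psi[j] = acc / phi[0]
    return psi

J, H = 200, 5                   # truncation of the series, number of lags
psi = psi_coefficients(phi, theta, J)

# Time domain: c_h = delta^2 * sum_j psi_{j+h} * psi_j (truncated at J terms).
cov_time = np.array([delta2 * np.dot(psi[h:], psi[:J - h]) for h in range(H)])

# Frequency domain: c_h = integral of t^h d(sigma_X)(t), approximated on a uniform grid.
N = 4096
t = np.exp(2j * np.pi * np.arange(N) / N)
density = delta2 * np.abs(P.polyval(np.conj(t), theta)) ** 2 \
    / np.abs(P.polyval(np.conj(t), phi)) ** 2
cov_freq = np.array([np.mean(t ** h * density).real for h in range(H)])
```

The two arrays agree up to rounding, since the density is smooth on the circle and the uniform-grid average converges very quickly.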

4.2. Varying analogues of ARMA-type processes. Here, we rewrite
the results of the previous subsection for varying Gaussian processes as
introduced in Section 2. We think especially of the VG-processes satisfying
formula (4.3) w.r.to a white noise $Z$. Of course, there are no mathematical
reasons to assume that the coefficients $\{\psi_j\}$ come from a rational function,
but since the situation is important in practice, we keep it in mind and
make some comments on the issue. The corresponding VG-processes will be
called varying ARMA-type processes ($\mathfrak{W}$-VARMA- or VARMA-processes,
to be brief).

To start with, let $\mathfrak{W}$ be a general system satisfying the usual properties (see
the beginning of Section 1, p. 5). For a VG-process $X$ we always assume (2.2)
and that the operator $\mathcal{Z}' = \mathcal{Z}^{-1}$, $\mathcal{Z}$ being defined right after the formulation
of Theorem 2.1, is bounded, i.e. $\|\mathcal{Z}'\| < R$ for some $R > 1$. Sometimes we
require that $\|\mathcal{Z}'^{-1}\| < R$, too. These assumptions on $\mathcal{Z}'$ ($\mathcal{Z}'^{-1}$) can be
easily verified, for example, for some systems of ORFs (i.e., $\{\alpha_k\}$ compactly
contained in $\mathbb{D}$ and $c_1 m \le \sigma \le c_2 m$ on $\mathbb{T}$ with $c_1, c_2 > 0$).

For instance, we may say that $X$ is a varying white noise process (VWN),
if $d\sigma = dm$, $\sigma$ being its spectral measure, see Theorem 1.1 and the discussion
at the beginning of Section 2.

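Let $\varphi, \theta$ be polynomials of degree $p$ and $q$, respectively. We say that $Y$
is a VARMA-process w.r.to a $\mathfrak{W}$-VG-process $X$, if $\varphi(\mathcal{Z}')Y_n = \theta(\mathcal{Z}')X_n$. More
generally, a process $Y$ is a $\psi$-VARMA-process w.r.to $X$, if

$$Y_n = \sum_{j=-\infty}^{\infty} \psi_j X_{n-j}, \tag{4.6}$$

where $\psi = \{\psi_j\}$ is such that

$$\sum_{j=-\infty}^{\infty} |\psi_j|\, R^{|j|} < \infty. \tag{4.7}$$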



Notice that $Y$ is not a $\mathfrak{W}$-VG-process in general, and the results of Section
2 do not apply. Nevertheless, we are able to obtain conclusions similar to
Theorems 1.1, 2.1 for these processes with the help of filtering tricks (2.3)-
(2.5) starting from the "reference" VG-processes $X$. We also note that the
introduced classes of VARMA- ($\psi$-VARMA-) processes are exactly properly
filtered VG-processes. The filters are of course linear and stationary.

It is plain that conditions (2.2) and (4.7) imply that the sum (4.6)
converges in $L^2(dP)$ and $Y$ is well-defined. As before, we say that $Y$ is causal
w.r.to $X$, if $Y_n = \sum_{j=0}^{\infty} \psi_j X_{n-j}$.
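The convergence claim can be traced through a short estimate. The sketch below is only one reading of the argument, under assumptions that are not restated in this section: that $\mathcal{Z}'$ acts on the process as $\mathcal{Z}'X_n = X_{n-1}$ (as in Section 4.1), that both $\|\mathcal{Z}'\|$ and $\|\mathcal{Z}'^{-1}\|$ are bounded by $R$, and that (4.7) reads $\sum_j |\psi_j| R^{|j|} < \infty$.

```latex
% Assumptions (ours): X_{n-j} = (\mathcal{Z}')^{j} X_n for all integers j,
% and \|\mathcal{Z}'\| \le R, \|\mathcal{Z}'^{-1}\| \le R.
\begin{align*}
\Big\| \sum_{|j| \le N} \psi_j X_{n-j} \Big\|_{L^2(dP)}
  &\le \sum_{|j| \le N} |\psi_j| \, \big\| (\mathcal{Z}')^{j} X_n \big\|_{L^2(dP)} \\
  &\le \|X_n\|_{L^2(dP)} \sum_{|j| \le N} |\psi_j| \, R^{|j|} .
\end{align*}
```

Under (4.7) the right-hand side stays bounded as $N \to \infty$, and the same estimate applied to the tails $N < |j| \le M$ shows that the partial sums form a Cauchy sequence in $L^2(dP)$.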

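The covariance matrices $C_Y = \{C_{n,Y}\}$ are easy to compute. Indeed, one
has

$$[\dots\ Y_{-1}\ Y_0\ Y_1\ \dots] = [\dots\ X_{-1}\ X_0\ X_1\ \dots] \begin{bmatrix} \ddots & & & & \\ & \psi_0 & \psi_1 & \psi_2 & \\ & \psi_{-1} & \psi_0 & \psi_1 & \\ & \psi_{-2} & \psi_{-1} & \psi_0 & \\ & & & & \ddots \end{bmatrix}$$

and, with $\Psi$ denoting the above matrix,

$$C_{n,Y} = \Psi^* C_{n,X} \Psi,$$

compare to (4.2).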
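For a filter with finitely many nonzero coefficients and a stationary reference process, the matrix identity above can be compared entry by entry with the double sum in (4.2). The sketch below is illustrative only: the filter `psi`, the covariance sequence `c` and the window size `N` are hypothetical, and the conventions assumed are $c_h = E(X_{n+h}\overline{X_n})$ and $\Psi_{k,n} = \psi_{n-k}$.

```python
import numpy as np

# Hypothetical filter with finite support: psi_j for j in {-1, 0, 1}, zero elsewhere.
psi = {-1: 0.3 + 0.1j, 0: 1.0, 1: -0.5j}

def c(h):
    """Hypothetical covariances c_h = E(X_{n+h} conj(X_n)); note c_{-h} = conj(c_h)."""
    base = (0.6 * np.exp(0.7j)) ** abs(h)
    return base if h >= 0 else np.conj(base)

N = 4
ns = range(N)            # indices n of Y_0, ..., Y_{N-1}
ks = range(-1, N + 1)    # indices k of the X_k entering those Y_n

# Psi[k, n] = psi_{n-k}, so that the row vector Y equals X @ Psi.
Psi = np.array([[psi.get(n - k, 0.0) for n in ns] for k in ks], dtype=complex)
C_X = np.array([[c(k2 - k1) for k2 in ks] for k1 in ks], dtype=complex)

# Matrix form: C_Y = Psi^* C_X Psi.
C_Y_matrix = Psi.conj().T @ C_X @ Psi

def c_tilde(h):
    """Double sum of (4.2): sum over j, k of psi_j * conj(psi_k) * c_{h-j+k}."""
    return sum(pj * np.conj(pk) * c(h - j + k)
               for j, pj in psi.items() for k, pk in psi.items())

C_Y_formula = np.array([[c_tilde(n2 - n1) for n2 in ns] for n1 in ns], dtype=complex)
```

Both constructions give the same Hermitian Toeplitz matrix; in the varying (non-stationary) case only the matrix form is available, since $C_{n,X}$ is no longer Toeplitz.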

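Proposition 4.2. Let $Y$ be a VARMA-process w.r.to a $\mathfrak{W}$-VG-process $X$,
and $\|\mathcal{Z}'\| < R$. Suppose that the polynomials $\varphi, \theta$ do not have common zeroes.
Put

$$\psi(z) = \frac{\theta(z)}{\varphi(z)} = \sum_{j=-\infty}^{\infty} \psi_j z^j.$$

If the zeroes of $\varphi$ are in $\{|z| > R\}$, $Y$ is causal w.r.to $X$ and $Y_n = \sum_{j=0}^{\infty} \psi_j X_{n-j}$.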

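Turning to the spectral part of the matter, we see that the spectral
characteristics of $Y$ defined by (4.6) are readily expressible in terms of the spectral
parameters of $X$. In fact, one has

$$E(Y_k \overline{Y_j}) = \int_{\mathbb{T}} \Big(\sum_{s=-\infty}^{\infty} \psi_s w_{k-s}\Big) \overline{\Big(\sum_{p=-\infty}^{\infty} \psi_p w_{j-p}\Big)}\, d\sigma,$$

where $E(X_k \overline{X_j}) = \int_{\mathbb{T}} w_k \overline{w_j}\, d\sigma$ and $\sigma$ is a spectral measure of $X$. Furthermore,

$$Y_n = \int_{\mathbb{T}} \Big(\sum_{j=-\infty}^{\infty} \psi_j w_{n-j}(\xi)\Big)\, dZ(\cdot,\xi),$$

where $X_n = \int_{\mathbb{T}} w_n(\xi)\, dZ(\cdot,\xi)$ is the spectral representation for $X$. It is
instructive to compare the last displayed formulas to (4.4) and (4.5).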